29 research outputs found

Recognizing Uncertainty in Speech

Author: Pon-Barry Heather
Shieber Stuart M.
Publication venue: 'Hindawi Limited'
Publication date: 01/12/2010

We address the problem of inferring a speaker's level of certainty based on prosodic information in the speech signal, which has application in speech-based dialogue systems. We show that using phrase-level prosodic features centered around the phrases causing uncertainty, in addition to utterance-level prosodic features, improves our model's level of certainty classification. In addition, our models can be used to predict which phrase a person is uncertain about. These results rely on a novel method for eliciting utterances of varying levels of certainty that allows us to compare the utility of contextually-based feature sets. We elicit level of certainty ratings from both the speakers themselves and a panel of listeners, finding that there is often a mismatch between speakers' internal states and their perceived states, and highlighting the importance of this distinction.

Comment: 11 page

arXiv.org e-Print Archive

Crossref

Harvard University - DASH

Springer - Publisher Connector

Directory of Open Access Journals

Finding Eyewitness Tweets During Crises

Author: Liu Huan
Lubold Nichola
Morstatter Fred
Pfeffer Jürgen
Pon-Barry Heather
Publication date: 01/01/2014

Disaster response agencies have started to incorporate social media as a source of fast-breaking information to understand the needs of people affected by the many crises that occur around the world. These agencies look for tweets from within the region affected by the crisis to get the latest updates on the status of the affected region. However, only 1% of all tweets are geotagged with explicit location information. First responders lose valuable information because they cannot assess the origin of many of the tweets they collect. In this work we seek to identify non-geotagged tweets that originate from within the crisis region. Toward this end, we address three questions: (1) is there a difference between the language of tweets originating within a crisis region and tweets originating outside the region, (2) what are the linguistic patterns that can be used to differentiate within-region and outside-region tweets, and (3) for non-geotagged tweets, can we automatically identify those originating within the crisis region in real time?

arXiv.org e-Print Archive

CiteSeerX

Crossref

Teaching TAs To Teach: Strategies for TA Training

Author: Ball Michael
Blank Adam
DeOrio Andrew
Hsia Justin
Pon-Barry Heather
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 26/02/2020

"The only thing that scales with undergrads is undergrads". As Computer Science course enrollments have grown, there has been a necessary increase in the number of undergraduate and graduate teaching assistants (TAs and UTAs). TA duties often extend far beyond grading, including designing and leading lab or recitation sections, holding office hours, and creating assignments. Though advanced students, TAs need proper pedagogical training to be the most effective in their roles. Training strategies have varied widely, from no training at all to semester-long prep courses. We will explore the challenges of TA training across both large and small departments. While much of the effort has focused on teams of undergraduates, most presenters have used the same tools and strategies with their graduate students. Training for TAs should include not just the mechanics of managing a classroom but also culturally relevant pedagogy. The panel will focus on the challenges of providing "just in time" training, and how we manage both intra-course training and department or campus led courses.

Crossref

Caltech Authors

The Importance of Sub-Utterance Prosody in Predicting Level of Certainty

Author: Pon-Barry Heather Roberta
Shieber Stuart M.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 22/02/2011

We present an experiment aimed at understanding how to optimally use acoustic and prosodic information to predict a speaker's level of certainty. With a corpus of utterances where we can isolate a single word or phrase that is responsible for the speaker's level of certainty, we use different sets of sub-utterance prosodic features to train models for predicting an utterance's perceived level of certainty. Our results suggest that using prosodic features of the word or phrase responsible for the level of certainty and of its surrounding context improves the prediction accuracy without increasing the total number of features when compared to using only features taken from the utterance as a whole.

Engineering and Applied Science

Harvard University - DASH

Identifying Uncertain Words within an Utterance via Prosodic Features

Author: Pon-Barry Heather Roberta
Shieber Stuart M.
Publication venue: 'International Speech Communication Association'
Publication date: 22/02/2011

We describe an experiment that investigates whether sub-utterance prosodic features can be used to detect uncertainty at the word level. That is, given an utterance that is classified as uncertain, we want to determine which word or phrase the speaker is uncertain about. We have a corpus of utterances spoken under varying degrees of certainty. Using combinations of sub-utterance prosodic features, we train models to predict the level of certainty of an utterance. On a set of utterances that were perceived to be uncertain, we compare the predictions of our models for two candidate target word segmentations: (a) one with the actual word causing uncertainty as the proposed target word, and (b) one with a control word as the proposed target word. Our best model correctly identifies the word causing the uncertainty rather than the control word 91% of the time.

Engineering and Applied Science

Harvard University - DASH

Inferring Speaker Affect in Spoken Natural Language Communication

Author: Pon-Barry Heather Roberta
Publication venue: 'Harvard University Botany Libraries'
Publication date: 15/03/2013

The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards machine listening: interpreting the full spectrum of information in an audio stream. One part of machine listening, the problem that this thesis focuses on, is the task of using information in the speech signal to infer a person’s emotional or mental state. In this dissertation, our approach is to assess the utility of prosody, or manner of speaking, in classifying speaker affect. Prosody refers to the acoustic features of natural speech: rhythm, stress, intonation, and energy. Affect refers to a person’s emotions and attitudes such as happiness, frustration, or uncertainty. We focus on one specific dimension of affect: level of certainty. Our goal is to automatically infer whether a person is confident or uncertain based on the prosody of his or her speech. Potential applications include conversational dialogue systems (e.g., in educational technology) and voice search (e.g., smartphone personal assistants). There are three main contributions of this thesis. The first contribution is a method for eliciting uncertain speech that binds a speaker’s uncertainty to a single phrase within the larger utterance, allowing us to compare the utility of contextually-based prosodic features. Second, we devise a technique for computing prosodic features from utterance segments that both improves uncertainty classification and can be used to determine which phrase a speaker is uncertain about. The level of certainty classifier achieves an accuracy of 75%. Third, we examine the differences between perceived, self-reported, and internal level of certainty, concluding that perceived certainty is aligned with internal certainty for some but not all speakers and that self-reports are a good proxy for internal certainty.

Engineering and Applied Science

Harvard University - DASH

Disordered speech disrupts conversational entrainment: a study of acoustic-prosodic entrainment and communicative success in populations with communication challenges

Author: Borrie Stephanie A.
Lubold Nichola
Pon-Barry Heather
Publication venue: Hosted by Utah State University Libraries
Publication date: 01/01/2015

Conversational entrainment, a pervasive communication phenomenon in which dialogue partners adapt their behaviors to align more closely with one another, is considered essential for successful spoken interaction. While well-established in other disciplines, this phenomenon has received limited attention in the field of speech pathology and the study of communication breakdowns in clinical populations. The current study examined acoustic-prosodic entrainment, as well as a measure of communicative success, in three distinctly different dialogue groups: (i) healthy native vs. healthy native speakers (Control), (ii) healthy native vs. foreign-accented speakers (Accented), and (iii) healthy native vs. dysarthric speakers (Disordered). Dialogue group comparisons revealed significant differences in how the groups entrain on particular acoustic-prosodic features, including pitch, intensity, and jitter. Most notably, the Disordered dialogues were characterized by significantly less acoustic-prosodic entrainment than the Control dialogues. Further, a positive relationship between entrainment indices and communicative success was identified. These results suggest that the study of conversational entrainment in speech pathology will have essential implications for both scientific theory and clinical application in this domain.

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

DigitalCommons@USU

Eliciting and annotating uncertainty in spoken language

Author: Longenbaugh Nicholas Steven
Pon-Barry Heather
Shieber Stuart Merrill
Publication date: 02/05/2014

A major challenge in the field of automatic recognition of emotion and affect in speech is the subjective nature of affect labels. The most common approach to acquiring affect labels is to ask a panel of listeners to rate a corpus of spoken utterances along one or more dimensions of interest. For applications ranging from educational technology to voice search to dictation, a speaker’s level of certainty is a primary dimension of interest. In such applications, we would like to know the speaker’s actual level of certainty, but past research has only revealed listeners’ perception of the speaker’s level of certainty. In this paper, we present a method for eliciting spoken utterances using stimuli that we design such that they have a quantitative, crowdsourced legibility score. While we cannot control a speaker’s actual internal level of certainty, the use of these stimuli provides a better estimate of internal certainty compared to existing speech corpora. The Harvard Uncertainty Speech Corpus, containing speech data, certainty annotations, and prosodic features, is made available to the research community.

Engineering and Applied Science

Harvard University - DASH